
3D Point Cloud Compression with Recurrent Neural Network and Image Compression Methods

Reference implementation of the publication 3D Point Cloud Compression with Recurrent Neural Network and Image Compression Methods, presented at the 33rd IEEE Intelligent Vehicles Symposium (IV 2022) in Aachen, Germany.

This repository provides an RNN-based compression framework for range images, implemented in TensorFlow 2.9.0. Furthermore, it provides preprocessing and inference nodes for ROS Noetic in order to perform the compression task on a /points2 sensor data stream.

3D Point Cloud Compression with Recurrent Neural Network and Image Compression Methods

Till Beemelmanns, Yuchen Tao, Bastian Lampe, Lennart Reiher, Raphael van Kempen, Timo Woopen, and Lutz Eckstein

Institute for Automotive Engineering (ika), RWTH Aachen University

Abstract — Storing and transmitting LiDAR point cloud data is essential for many AV applications, such as training data collection, remote control, cloud services or SLAM. However, due to the sparsity and unordered structure of the data, it is difficult to compress point cloud data to a low volume. Transforming the raw point cloud data into a dense 2D matrix structure is a promising way for applying compression algorithms. We propose a new lossless and calibrated 3D-to-2D transformation which allows compression algorithms to efficiently exploit spatial correlations within the 2D representation. To compress the structured representation, we use common image compression methods and also a self-supervised deep compression approach using a recurrent neural network. We also rearrange the LiDAR’s intensity measurements to a dense 2D representation and propose a new metric to evaluate the compression performance of the intensity. Compared to approaches that are based on generic octree point cloud compression or based on raw point cloud data compression, our approach achieves the best quantitative and visual performance.

Content

- Approach
- Range Image Compression
- Model Training
- Download of dataset, model weights and evaluation frames
- Inference & Evaluation Node
- Authors of this Repository
- Cite
- Acknowledgement

Approach

Range Image Compression

This learning framework serves the purpose of compressing range images projected from point clouds captured by a Velodyne LiDAR sensor. The network architectures are based on the work proposed by Toderici et al. and are implemented in this repository using TensorFlow 2.9.0.

The following architectures are implemented:

- Additive LSTM - Architecture that we used for the results in our paper.
- Additive LSTM Demo - Lightweight version of the Additive LSTM in order to test your setup.
- Additive GRU - Uses Gated Recurrent Units instead of an LSTM cell. We achieved slightly worse results compared to the LSTM variant.
- Oneshot LSTM - Does not use the additive reconstruction path in the network, as shown below.

The differences between the additive and the oneshot framework are visualized below.

Additive reconstruction framework

This reconstruction framework adds the previous progressive reconstruction to the current reconstruction.

Oneshot reconstruction framework

The current reconstruction is directly computed from the bitstream by the decoder RNN.
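To make the difference concrete, here is a minimal, simplified sketch; it is not the repository's implementation, and `encoder`, `decoder` and `binarize` are placeholder callables standing in for the recurrent encoder, decoder and binarization stages.

```python
# Simplified sketch of additive vs. oneshot reconstruction (not the repository's code).
# `encoder`, `decoder` and `binarize` are placeholders for the recurrent stages.

def compress_additive(x, encoder, decoder, binarize, num_iters):
    reconstruction = 0.0
    residual = x
    for _ in range(num_iters):
        code = binarize(encoder(residual))               # bits emitted in this iteration
        reconstruction = reconstruction + decoder(code)  # add to the previous progressive reconstruction
        residual = x - reconstruction                    # encode what is still missing
    return reconstruction

def compress_oneshot(x, encoder, decoder, binarize, num_iters):
    reconstruction = 0.0
    residual = x
    for _ in range(num_iters):
        code = binarize(encoder(residual))
        reconstruction = decoder(code)                   # decoder outputs the full reconstruction directly
        residual = x - reconstruction
    return reconstruction
```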

Model Input - Output

The range representations are originally (32, 1812) single-channel 16-bit images. During training, they are randomly cropped on the fly into (32, 32) image patches. The label is the input image patch itself. Note that during validation, the random crop is performed offline to keep the validation set constant.

Input: 16-bit image patch of shape (32, 1812, 1), which is randomly cropped to shape (32, 32, 1) during training.

Output: reconstructed 16-bit image patch of shape (32, 32, 1).

Input | Label | Prediction
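As an illustration of this input pipeline, the following sketch builds training pairs with an online random crop. It is an assumption for illustration only; the file format, normalization and the *.png glob are not taken from the repository.

```python
import tensorflow as tf

def load_and_crop(path):
    # Assumed file format: 16-bit single-channel PNG range images of shape (32, 1812, 1).
    image = tf.io.decode_png(tf.io.read_file(path), channels=1, dtype=tf.uint16)
    image = tf.cast(image, tf.float32) / 65535.0           # scale 16-bit values to [0, 1]
    patch = tf.image.random_crop(image, size=(32, 32, 1))  # online random crop
    return patch, patch                                    # the label is the input patch itself

train_ds = (tf.data.Dataset.list_files("demo_samples/training/*.png")
            .map(load_and_crop, num_parallel_calls=tf.data.AUTOTUNE)
            .batch(32)
            .prefetch(tf.data.AUTOTUNE))
```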

During inference, the model can be applied to range images of any width and height divisible by 32, as the following example shows:

Input range image representation:

Predicted range image representation:
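A minimal inference sketch under assumptions: restoring the model directly from the .hdf5 file with tf.keras.models.load_model may require the repository's custom objects, and the width 1792 is just a placeholder value divisible by 32.

```python
import numpy as np
import tensorflow as tf

# Assumption: the trained model can be restored directly from the downloaded .hdf5 file.
model = tf.keras.models.load_model("additive_lstm_32b_32iter.hdf5", compile=False)

range_image = np.zeros((32, 1792, 1), dtype=np.float32)  # placeholder range image, width divisible by 32
batch = range_image[np.newaxis, ...]                      # add batch dimension -> (1, 32, 1792, 1)
reconstruction = model.predict(batch)[0]                  # reconstructed range image (output format depends on the model)
```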

Model Training

Sample Dataset

A demo sample dataset can be found in range_image_compression/demo_samples in order to quickly test your environment.

Dependencies

The implementation is based on TensorFlow 2.9.0. All necessary dependencies can be installed with:

# /range_image_compression >$ pip install -r requirements.txt

The architecture Oneshot LSTM uses GDN layers proposed by Ballé et al., which are provided by TensorFlow Compression. It can be installed with:

# /range_image_compression >$ pip3 install tensorflow-compression==2.9.0

Run Training

A training with the lightweight demo network using the sample dataset can be started with the following command:

# /range_image_compression >$ python3 ./train.py \
    --train_data_dir="demo_samples/training" \
    --val_data_dir="demo_samples/validation" \
    --train_output_dir="output" \
    --model=additive_lstm_demo

If you downloaded the whole dataset (see below), adapt the values for --train_data_dir and --val_data_dir accordingly.

Run Training with Docker

We also provide a Docker environment to train the model. First build the Docker image, then start the training:

# /docker >$ ./docker_build.sh
# /docker >$ ./docker_train.sh

Configuration

Parameters for the training can be set in the configuration files located in the directory range_image_compression/configs. A powerful multi-GPU setup is recommended for efficient training.

Learning Rate Schedule

The following cosine learning rate scheduler is used:

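A standard cosine decay consistent with the parameters listed below is (this is a reconstruction of the schedule, not necessarily the exact formula used in the code):

$$
\eta(e) \;=\; \eta_{\min} + \tfrac{1}{2}\,\bigl(\eta_{\text{init}} - \eta_{\min}\bigr)\Bigl(1 + \cos\bigl(\pi\,\tfrac{e - e_{\text{start}}}{e_{\text{end}} - e_{\text{start}}}\bigr)\Bigr),
\qquad e_{\text{start}} \le e \le e_{\text{end}}
$$

with η_init = init_learning_rate, η_min = min_learning_rate, e_start = max_learning_rate_epoch and e_end = min_learning_rate_epoch.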

| Parameter | Description | Example |
|---|---|---|
| init_learning_rate | initial learning rate | 2e-3 |
| min_learning_rate | minimum learning rate | 1e-7 |
| max_learning_rate_epoch | epoch where the learning rate begins to decrease | 0 |
| min_learning_rate_epoch | epoch of the minimum learning rate | 3000 |

Network Architecture

Default parameters for the network architectures

| Parameter | Description | Example |
|---|---|---|
| bottleneck | bottleneck size of the architecture | 32 |
| num_iters | maximum number of iterations | 32 |

Model Zoo

| Model Architecture | Filename | Batch Size | Validation MAE | Evaluation SNNRMSE (iter=32) |
|---|---|---|---|---|
| Additive LSTM | additive_lstm_32b_32iter.hdf5 | 32 | 2.6e-04 | 0.03473 |
| Additive LSTM Slim | additive_lstm_128b_32iter_slim.hdf5 | 128 | 2.3e-04 | 0.03636 |
| Additive LSTM Demo | additive_lstm_128b_32iter_demo.hdf5 | 128 | 2.9e-04 | 0.09762 |
| Oneshot LSTM | oneshot_lstm_b128_32iter.hdf5 | 128 | 2.9e-04 | 0.05137 |
| Additive GRU | Will be uploaded soon | TBD | TBD | TBD |

Download of Dataset, Models and Evaluation Frames

The dataset to train the range image compression framework can be retrieved from https://rwth-aachen.sciebo.de/s/MLJ4UaDJuVkYtna. There you should be able to download the following files:

- pointcloud_compression.zip (1.8 GB): ZIP file containing three directories:
  - train: 30813 range images for training
  - val: 6162 cropped 32 x 32 range images for validation
  - test: 1217 range images for testing
- evaluation_frames.bag (17.6 MB): ROS bag which contains the Velodyne packet data used to evaluate the model. Frames in this bag file were used neither for training nor for validation.
- additive_lstm_32b_32iter.hdf5 (271 MB): Trained model with a bottleneck size of 32
- additive_lstm_128b_32iter_slim.hdf5 (67.9 MB): Trained model with a bottleneck size of 32
- oneshot_lstm_b128_32iter.hdf5 (68.1 MB): Trained model with a bottleneck size of 32
- additive_lstm_256b_32iter_demo.hdf5 (14.3 MB): Trained model with a bottleneck size of 32

Inference & Evaluation Node

Inference and evaluation are conducted with ROS. The inference and preprocessing nodes can be found under catkin_ws/src. The idea is to use recorded Velodyne packet data stored in a .bag file. This bag file is then played with rosbag play, and the preprocessing and inference nodes are applied to the raw sensor data stream. In order to run the inference and evaluation, execute the following steps.
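As a quick way to verify that the sensor stream is arriving, a small monitoring node like the following could be used. This is an illustrative helper, not one of the repository's nodes, and the node name is made up for the example.

```python
import rospy
import sensor_msgs.point_cloud2 as pc2
from sensor_msgs.msg import PointCloud2

def callback(msg):
    # Count the points in each incoming cloud to confirm the /points2 stream is alive.
    num_points = sum(1 for _ in pc2.read_points(msg, skip_nans=True))
    rospy.loginfo("received point cloud with %d points", num_points)

rospy.init_node("points2_monitor")
rospy.Subscriber("/points2", PointCloud2, callback, queue_size=1)
rospy.spin()
```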

1. Initialize Velodyne Driver

To run the evaluation code, we need to clone the Velodyne driver from the official repository. The driver is integrated as a submodule at catkin_ws/src/velodyne. You can initialize the submodule with:

git submodule init
git submodule update

2. Pull Docker Image

We provide a Docker image which contains ROS Noetic and TensorFlow in order to execute the inference and evaluation nodes. You can pull this image from Docker Hub with the command:

docker pull tillbeemelmanns/pointcloud_compression:noetic

3. Download Weights

Download the model's weights as .hdf5 from the download link and copy this file to catkin_ws/models. Then, insert the correct path for the parameter weights_path in the configuration file. Note that we will later mount the directory catkin_ws into the Docker container at /catkin_ws. Hence, the path will start with /catkin_ws.
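To quickly check that the downloaded weights file is intact, you could inspect it with h5py. This is an optional sanity check, not part of the repository; the path and filename below are examples.

```python
import h5py

# List the top-level groups of the downloaded Keras .hdf5 file.
with h5py.File("catkin_ws/models/additive_lstm_32b_32iter.hdf5", "r") as f:
    print(list(f.keys()))
```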

4. Download ROS Bag

Perform the same procedure with the bag file: copy it into catkin_ws/rosbags. The filename should be evaluation_frames.bag. This bag file will be used in this launch file.

5. Start Docker Container

Start the Docker container using the script:

# /docker >$ ./docker_eval.sh

Calling this script for the first time will just start the container. Then open a second terminal and execute the ./docker_eval.sh script again in order to get a bash shell inside the container.

# /docker >$ ./docker_eval.sh

6. Build & Execute

Run the following commands inside the container to build and launch the compression framework.

# /catkin_ws >$ catkin build
# /catkin_ws >$ source devel/setup.bash
# /catkin_ws >$ roslaunch pointcloud_to_rangeimage compression.launch

You should see the following RVIZ window which visualizes the reconstructed point cloud.

7. GPU Inference

In case you have a capable GPU, change line 44 in docker_eval.sh and define your GPU ID with the flag -g:

$DIR/run-ros.sh -g 0 $@

Authors of this Repository

Till Beemelmanns and Yuchen Tao

Mail: till.beemelmanns (at) rwth-aachen.de

Mail: yuchen.tao (at) rwth-aachen.de

Cite

@INPROCEEDINGS{9827270,
  author={Beemelmanns, Till and Tao, Yuchen and Lampe, Bastian and Reiher, Lennart and Kempen, Raphael van and Woopen, Timo and Eckstein, Lutz},
  booktitle={2022 IEEE Intelligent Vehicles Symposium (IV)},
  title={3D Point Cloud Compression with Recurrent Neural Network and Image Compression Methods},
  year={2022},
  volume={},
  number={},
  pages={345-351},
  doi={10.1109/IV51971.2022.9827270}
}

Acknowledgement

This research was accomplished within the project "UNICARagil" (FKZ 16EMO0284K). We acknowledge the financial support for the project by the Federal Ministry of Education and Research of Germany (BMBF).


